A clinical database to assess action levels and tolerances for the ongoing use of Mobius3D

Abstract
In radiation therapy, calculation of dose within the patient contains inherent uncertainties, inaccuracies, limitations, and the potential for random error. Thus, independent point dose verification of such calculations is a well‐established process, with published data to support the setting of both action levels and tolerances. Mobius3D takes this process one step further with a full independent calculation of patient dose and comparisons of clinical parameters such as mean target dose and voxel‐by‐voxel gamma analysis. There are currently no published data to directly inform tolerance levels for such parameters, and therefore this work presents a database of 1000 Mobius3D results to fill this gap. The data are tested for normality using a normal probability plot and found to fit this distribution for three sub groups of data: Eclipse, iPlan, and the treatment site Lung. The mean (μ) and standard deviation (σ) of these sub groups are used to set action levels and tolerances at μ ± 2σ and μ ± 3σ, respectively. A global (3%, 3 mm) gamma tolerance is set at 88.5%. The mean target dose tolerance for Eclipse data is the narrowest at ± 3%, whilst iPlan and Lung have ranges of −5.0 to 2.2% and −1.8 to 5.0%, respectively. With these limits in place, future results failing the action level or tolerance will fall within the worst 5% and 1% of historical results, respectively, and an informed decision can be made regarding remedial action prior to treatment.

been rudimentary in nature, calculating dose to a single point under simplified scatter conditions. [10][11][12] They are also often based on the same local beam data as the TPS and are in practice not fully independent from the original patient-specific dose calculation, and hence do not give a clinically valuable representation of the uncertainties involved.
An extension to this concept is now commercially available in the form of Mobius3D, independent verification software from Mobius Medical Systems LP (Houston, TX, USA). In contrast to past monitor unit verification programs, this software utilizes the full patient DICOM set (CT, plan, structure set, and dose) to recalculate 3D dose using a collapsed cone convolution superposition (CCCS) algorithm [13] and independent reference beam data. The reference beam models were created based on data from multiple machines and further compared to published standards and Imaging and Radiation Oncology Core (IROC)-Houston consensus data, before being implemented in the Mobius3D system as a golden reference set. This is fundamentally a more rigorous check of the patient dose calculation and, moreover, can provide information regarding the associated uncertainties of clinical parameters such as mean target dose differences and voxel-by-voxel 3D gamma analysis.
Historical point dose verification is analogous to mean target dose in the sense that the two converge if point verification is done for all voxels within the target. Similarly, 3D gamma comparisons are analogous to physical measurement-based verification of beam deliveries such as intensity-modulated radiotherapy (IMRT) and volumetric-modulated radiotherapy, although these are often limited to planar 2D gamma comparisons. Both of the above analogies are further limited in the sense that they do not include the actual patient geometry in their calculations.
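To make the gamma comparison concrete, the following is a minimal one-dimensional sketch of a global (3%, 3 mm) gamma pass rate. It is not the Mobius3D implementation, which operates voxel by voxel in 3D on the patient geometry; the function name `gamma_pass_rate`, the choice of which profile is the reference, the normalisation to the maximum reference dose, and the sample profiles are all assumptions made for illustration.

```python
import math

def gamma_pass_rate(positions_mm, reference, evaluated, dose_pct=3.0, dta_mm=3.0):
    """Percentage of reference points with gamma <= 1 (global normalisation)."""
    # Global criterion: dose tolerance is a fraction of the maximum reference dose.
    dose_crit = dose_pct / 100.0 * max(reference)
    passed = 0
    for x_ref, d_ref in zip(positions_mm, reference):
        # Gamma is the smallest combined distance/dose metric over all evaluated points.
        gamma = min(
            math.hypot((x_ev - x_ref) / dta_mm, (d_ev - d_ref) / dose_crit)
            for x_ev, d_ev in zip(positions_mm, evaluated)
        )
        if gamma <= 1.0:
            passed += 1
    return 100.0 * passed / len(reference)

# Hypothetical 1D dose profiles on a 1 mm grid; not data from this work.
positions = [float(i) for i in range(11)]
reference = [10, 20, 50, 90, 100, 100, 100, 90, 50, 20, 10]
evaluated = [d * 1.02 for d in reference]  # uniform 2% dose difference

print(gamma_pass_rate(positions, reference, evaluated))  # 100.0
```

A uniform 2% difference stays inside the 3% criterion at every point, so all points pass. A clinical implementation would typically also apply a low-dose threshold and interpolate between grid points, both of which are omitted here.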
Mobius3D is designed to detect gross errors in treatment planning and therefore the default action levels are inherently loose in nature, but can be further customized by the end user. The authors believe that with appropriate customization, Mobius3D can not only pick up gross errors but also inform clinical decision making regarding more subtle TPS limitations and uncertainties. Comprehensive commissioning results have been presented previously, [14,15] but neither specifically addressed action levels. The primary aim of the work presented herein is the application of a local clinical database of Mobius3D results, used to inform action levels for these verification parameters. Retrospective statistical analysis is used to tailor action levels based on a number of system parameters (TPS, delivery technique, and treatment site). The secondary aim is to not only identify gross errors but to also improve and streamline the treatment planning process in general.
2 | MATERIALS AND METHODS
2.A | Materials: treatment planning and delivery systems
As summarized in (Table 1)

2.B | Materials: Mobius3D
The Mobius3D server and dose calculation engine come preconfigured based on a single (per photon energy) user-provided percentage depth dose (PDD) value and consensus beam data. Discrete values of PDD, output factor, and off-axis ratio can be customized, although locally these were left unmodified as the Mobius3D data were deemed to be sufficiently close to local reference data, and fur-

3 | RESULTS
A subset of database results is summarized in Table 2. Scatter plots of mean dose difference are presented over time (Fig. 1) and further broken down by linac (1a), delivery technique (1b), and TPS (1c). It is worth noting that the clinical use of Mobius3D saw a somewhat staged roll out, starting first with Eclipse and then iPlan. This affected the total number of patients in any given group, as can most clearly be seen in Fig. 1(c). As previously mentioned, all iPlan-based planning utilizes either DCAs or IMRS and therefore the staging has also affected the numbers in these sub groups. Taken in isolation, Fig. 1(a) may indicate that results are getting worse over time, but as can be seen in Fig. 1(b-c) the majority of these poorer results were DCAs or IMRS planned in iPlan. In general, these plans have much smaller field sizes than Eclipse-planned 3DCRT/IMRT and are also calculated with a PBC, which is known to have greater inaccuracies when heterogeneous media are involved.
Normal probability plots are presented in Fig. 2(a-c) for All Data, Eclipse, and iPlan. Mean dose difference histogram results for All Data, Eclipse, iPlan, and the treatment site Lung can be found in Fig. 3(a-d).
All lung patients were exclusively planned in Eclipse with the AAA, as opposed to iPlan's PBC. The gamma results for All Data are presented as a histogram in Fig. 4. The normal probability plots for All Data and iPlan data conform to the linear regression, indicating that the results for these sub groups are normal and thus the μ and σ can reasonably be used to set tolerance levels. As can be seen in Fig. 4, the gamma results are skewed due to the nature of the parameter itself and do not fit a normal distribution. In this case, the application of μ- and σ-based limits was chosen for uniformity and to allow the limits to vary as more data are added to the database.
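The limit-setting step described above can be sketched as follows. This is an illustrative outline only, not the authors' analysis code: the function names `limits` and `probability_plot_r`, the plotting-position convention (i − 0.5)/n, and the sample dose differences are assumptions. The mean (μ) and sample standard deviation (σ) give the action level at μ ± 2σ and the tolerance at μ ± 3σ, and the correlation between the ordered data and the theoretical normal quantiles indicates how closely a normal probability plot follows a straight line.

```python
from statistics import NormalDist, mean, stdev

def limits(values, k):
    """Return (lower, upper) = mean -/+ k * sample standard deviation."""
    mu, sigma = mean(values), stdev(values)
    return mu - k * sigma, mu + k * sigma

def probability_plot_r(values):
    """Correlation between ordered data and theoretical normal quantiles.

    A value close to 1 means the normal probability plot is nearly linear.
    """
    n = len(values)
    ordered = sorted(values)
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, my = mean(theoretical), mean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, ordered))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical mean target dose differences (%); not data from this work.
dose_diffs = [-1.2, 0.4, 0.9, -0.3, 1.5, 0.1, -0.8, 0.6, 2.0, -0.5, 0.2, 1.1]

action_level = limits(dose_diffs, 2)  # mu +/- 2 sigma
tolerance = limits(dose_diffs, 3)     # mu +/- 3 sigma
r = probability_plot_r(dose_diffs)

print(f"action level: {action_level[0]:.2f}% to {action_level[1]:.2f}%")
print(f"tolerance:    {tolerance[0]:.2f}% to {tolerance[1]:.2f}%")
print(f"probability plot correlation: {r:.3f}")
```

Because the limits are recomputed from whatever values are supplied, rerunning the calculation as results accumulate lets them vary with the database. For a skewed parameter such as the gamma pass rate, the correlation would be expected to fall further below 1.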
Based on these data, the locally defined tolerances and action levels are presented in Table 3.
has severe limitations, especially when heterogeneous media (as are often found within a real patient) are involved. [17][18][19] Lung treatments tend to involve water-equivalent targets downstream of relatively low-density media. In situations like this, the AAA of Eclipse has been shown to underpredict dose [5,20] and CCCS to overpredict dose; [21] the reality is most likely to be somewhere in between. In the near future the AcurosXB and Monte Carlo dose calculation engines will be implemented in Eclipse and iPlan, respectively, and thus new algorithm- and treatment site-specific tolerances will be required. Lastly, it is now possible to access the backend of the Mobius3D database through MATLAB scripting and thus there is potential to automate the presented database and analytic tools. The authors foresee this as a powerful method of simultaneously performing periodic quality assurance on multiple planning systems, verification software, and the planning process in general. This would give clinics the ability to streamline outdated methods and easily track potential changes or corruption in many variables over time.

5 | CONCLUSION
As with any independent verification method, the question arises as to what action is required when a result exceeds the specified tolerance. By implementation of a clinical database this decision becomes an informed one, because the tolerance level is based on a large number of historical results. Following the work presented herein, if a future result falls outside the action level or tolerance, then there is a good level of confidence that it is within the worst 5% and 1% of all expected results, respectively. A multidisciplinary approach can then be utilized to make an informed decision regarding the clinical significance and whether remedial action is required before treatment proceeds.
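As a rough illustration of this grading, the sketch below classifies a new result against the μ ± 2σ action level and μ ± 3σ tolerance, and computes the fraction of a normal distribution expected beyond each limit. The function names `tail_fraction` and `classify` and the values of μ and σ are hypothetical; the sketch assumes the subgroup is normally distributed.

```python
from statistics import NormalDist

def tail_fraction(k):
    """Two-sided fraction of a normal distribution beyond mu +/- k * sigma."""
    return 2 * (1 - NormalDist().cdf(k))

def classify(value, mu, sigma):
    """Grade a result against the 2-sigma action level and 3-sigma tolerance."""
    deviation = abs(value - mu)
    if deviation > 3 * sigma:
        return "outside tolerance"
    if deviation > 2 * sigma:
        return "outside action level"
    return "within action level"

# Hypothetical subgroup statistics (%), chosen for illustration only.
mu, sigma = 0.0, 1.0

print(classify(0.5, mu, sigma))   # within action level
print(classify(-2.5, mu, sigma))  # outside action level
print(classify(3.4, mu, sigma))   # outside tolerance

print(f"{tail_fraction(2):.4f}")  # 0.0455
print(f"{tail_fraction(3):.4f}")  # 0.0027
```

Under the normality assumption, about 4.6% of results lie beyond μ ± 2σ and about 0.3% beyond μ ± 3σ, so the quoted 5% and 1% figures act as upper bounds for a normally distributed subgroup.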

CONFLICT OF INTEREST
The authors declare no conflict of interest.